A Formal Comparison of Visual Web Wrapper Generators

نویسندگان

  • Georg Gottlob
  • Christoph Koch
چکیده

We study the core fragment of the Elog wrapping language used in the Lixto system (a visual wrapper generator) and formally compare Elog to other wrapping languages proposed in the literature.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. In our extraction algorithm, we inspired by the way a human user scans the page content for specific data. In particular, we use text fea...

متن کامل

Chapter 3 . 24 XWRAPComposer : A Multi - Page Data Extraction Service

We present a service-oriented architecture and a set of techniques for developing wrapper code generators, including the methodology of designing an effective wrapper program construction facility and a concrete implementation, called XWRAPComposer. Our wrapper generation framework has two unique design goals. First, we explicitly separate tasks of building wrappers that are specific to a Web s...

متن کامل

A Multi-Page Data Extraction Service

We present a service-oriented architecture and a set of techniques for developing wrapper code generators, including the methodology of designing an effective wrapper program construction facility and a concrete implementation, called XWRAPComposer. Our wrapper generation framework has two unique design goals. First, we explicitly separate tasks of building wrappers that are specific to a Web s...

متن کامل

Intelligent Wrapping of Information Sources: Getting Ready for the Electronic Market

Literature search and delivery in the World Wide Web becomes a rapidly expanding market. Up to now the search is mostly cost-free. But in the future we expect the appearance of more and more providers charging for their services. The main problems are finding the right provider and extracting the information. UniCats is a system for intelligent information search and extraction from multiple pr...

متن کامل

MNTG: An Extensible Web-Based Traffic Generator

Road network traffic datasets have attracted significant attention in the past decade. For instance, in spatio-temporal databases area, researchers harness road network traffic data to evaluate and validate their research. Collecting real traffic datasets is tedious as it usually takes a significant amount of time and effort. Alternatively, many researchers opt to generate synthetic traffic dat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006